Implementation and testing of Lanczos-based algorithms for Random-Phase 

Approximation eigenproblems. 
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The treatment of the Random-Phase Approximation Hamiltonians, encountered in different frame- 
works, like Time-Dependent Density Functional Theory or Bethe-Salpeter equation, is complicated 
by their non-Hermicity. Compared to their Hermitian Hamiltonian counterparts, computational 
methods for the treatment of non-Hermitian Hamiltonians are often less efficient and less stable, 
sometimes leading to the breakdown of the method. Recently [Griining et al. Nano Lett. 8, 2820 
(2009)], we have identified that such Hamiltonians are usually pseudo-Hermitian. Exploiting this 
property, we have implemented an algorithm of the Lanczos type for random-Phase Approximation 
Hamiltonians that benefits from the same stability and computational load as its Hermitian coun- 
terpart, and applied it to the study of the optical response of carbon nanotubes. We present here the 
related theoretical grounds and technical details, and study the performance of the algorithm for the 
calculation of the optical absorption of a molecule within the Bethe-Salpeter equation framework. 
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I. INTRODUCTION 

The Random-Phase Approximation (RPA) Hamilto- 
nian iJ RPA appears in several areas of physics and theo- 
retical chemistry, and describes strong collective excita- 
tions of a many-body system as the linear combination 
of particle- hole pairs |A/z}i>£. It has the form 



H 



RPA 



R C 
-C* -R* 



(1) 



where the resonant R and anti-resonant — R* blocks 
are Hermitian matrices in the subspace generated 
by particle-hole pairs propagating respectively forward 
(|A/i)) and backward (|//A)) in time (in what follows a, A 
indicate particles and (3,fi holes), and the C and — C* 
blocks are symmetric matrices coupling the particle-hole 
pairs propagating forward and backward in time. The ex- 
citation energies and strengths of the many-body system 
are the eigensolutions of Eq. ([IJ. Note that the RPA 
Hamiltonian is not Hermitian, thus its eigenvalues are 
not necessarely real. 

In quantum chemistry, condensed matter physics, 
nanoscience, or nuclear physics, the RPA Hamiltonian 
appears within the state-of-the-art approaches for calcu- 
lating the excitations in an electronic system: the time- 
dependent density functional theory 3 (TD-DFT) and 
the Bethe-Salpeter 4 (BS) equation. 5 In the commonly 
used approximations to TD-DFT (e.g. real exchange- 
correlation kernel) and BS equation (static screening 
of the interaction), all the eigenvalues are real. TD- 
DFT is particularly successful for finite systems, namely 
molecules and molecular clusters, while the BS approach 
is mostly used for extended systems, like periodic bulk 
solids and, in general, for systems where excitonic ef- 



fects play an important role 6 . Nowadays, the appli- 
cation of these approaches to the computation of the 
time-dependent responses of more and more complex 
systems, such as large bio-molecules or nanostructures, 
poses the problem of efficient solution of the eigenprob- 
lem for iJ RPA . For large matrices, the direct diagonal- 
ization is usually not possible, and one has to resort to 
iterative algorithms, such as the Lanczos method. Such 
algorithms exist for both Hermitian and non-Hermitian 
Hamiltonian. However, compared to their Hermitian 
Hamiltonian counterparts, algorithms for the treatment 
of non-Hermitian Hamiltonians are often less efficient and 
less stable, sometimes leading to the breakdown of the 
method^. 

Within TDDFT, the very convenient Hermitian for- 
mulation of the eigenvalue problem proposed by Casida 9 
exists. However its application is limited to finite sys- 
tems and purely local effective potentials for which the 
jjRPA j g rea i rp ne p resence f e g spin-orbit cou- 
pling prevents the application of Casida's approach. In 
general a further approximation is introduced, the so- 
called Tamm-Dancoff approximation (TDA), that consid- 
ers only particle-hole pairs propagating forward in time, 
so that the TDA Hamiltonian corresponds just to the 
resonant part, i? TDA = R. The TDA is often sufficiently 
accurate, as in the case of optical absorption spectra of 
periodic bulk systems. On the other hand, the TDA 
becomes inaccurate or even unphysical in the case of 
electron-energy-loss spectral, reflectivity spectral, and 
also for the optical absorption of low-dimensional sys- 
tems, e.g. nanosystems or 7r— conjugated molecule o 12 ' 13 . 
In a previous worki^ we have implemented an approach 
for the solution of the RPA Hamiltonian, that avoids the 
TDA, and still benefits of the efficiency and robustness 
of the algorithms for the Hermitian case. This approach 
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has been already successfully applied to the calculation 
of the optical absorption and energy-loss spectra of a car- 
bon nanotube. While our previous work focussed on the 
implications of the TDA for nanoscale systems, in this 
work the focus is on the theoretical grounds and some 
more technical aspects of that approach. We show here 
how the Lanczos algorithm for Hermitian eigenproblem 
(Sec. Ill A[) can be used for the RPA Hamiltonian, that 
is pseudo- Hermitian with real eigenvalues (Sec. IIIC|) . by 
simply redefining the inner product (Sec. [TO}- We ob- 
tain (Sec. IIII B|) the generalization to complex matrices 
of the scheme proposed by Van der Vorst in Ref. [H]. The 
obtained algorithm is then further specialized (Sec. IIII C|) 
to the calculation of the macroscopic dielectric function 
(from which the optical absorption and energy-loss spec- 
tra are derived) and finally applied to the calculation 
of the optical response of the trichloro-bezene isomers 
within the BS equation framework (Sec. lIVp . to show the 
algorithm accuracy fSec. UVBj) and efficiency (Sec lIVCj) . 



II. MATHEMATICAL BACKGROUND 



where 



aj = (lj\H\qj), (4) 
bj+i = \\Q j+ i\\, (5) 
\qj+x) = \Q j+ i)/b j+ i. (6) 

The algorithm is schematically described in Fig.(JT]). In 

b <= ||"o|| (A) 

\s) ^ K)/6 (B) 

|t) <= H\s) (C) 

6i <^ 0, \r) <= (D) 
i <t= 1 

i- until CONVERGENCE do 

a,i <S= (s\t) (E) 

\t) <= \t) -ai|s) -h\r) (F) 

k+i «= 1*11 (G) 

|r) 4= \s) (H) 

\s) <= \t)/b i+1 (I) 

|t)«= H\s) (K) 
i <=i + l 



This section reviews briefly the two key "ingredi- 
ents" of the presented approach: the Lanczos method 
for the solution of (non-)Hermitian eigenproblems, and 
the definition of pseudo-Hermitian matrix. The Lanczos 
method allows to calculate by recursion the eigenvalues, 
and eigenvectors, or directly the response spectrum, of 
large matrices. The pseudo-Hermicity is related to the 
reality of the eigenvalues of a matrix and with the possi- 
bility of transforming the matrix into a Hermitian matrix. 



A. Lanczos method for Hermitian eigenproblems 

The Lanczos recursion method 7 is a general algorithm 
for solving eigenproblems for a Hermitian operator H. 
This algorithm recursively builds an orthonormal basis 
{|gi)} (Lanczos basis) in which H is represented as a real 
symmetric tridiagonal matrix, 



T 



fai b 2 

b 2 a 2 

'•• 

\0 ••• 




b 3 







bk-i afc-i bk 
b k a k J 



(2) 



The first vector \qi) of the Lanczos basis is set equal to 
a (normalized) given vector |uo)/||ito||- The next vectors 
are calculated from the three-term relation 



FIG. 1: Hermitian Lanczos algorithm 

steps (A)-(D) the variables are initialized before enter- 
ing the conditional loop [steps (E)-(K)]. Here, at each 
iteration a new vector of the Lanczos basis is computed 
till the convergence criteria is met. The cost per itera- 
tion is given mainly by the matrix-vector multiplication 
at step (K), that is of 0(N 2 ) for non-sparse matrices, 
with N the size of H . In terms of memory and storage, if 
one is just interested in the eigenvalues, at each iteration 
only three vectors (\q n -i), \<ln)i lln+i)) are needed, and 
only two reals (ai,6i) need to be stored. At the end of 
the process one gets the tridiagonal matrix of Eq. ^ of 
dimension k x k, that can be diagonalized with a cost 
oc k. Compared with the standard diagonalization, the 
advantages are the memory usage, and the computational 
cost oc kN 2 (for diagonalization is 0(N 3 )) as soon as the 
number of iterations k N. This is in practice always 
the case when we are interested only in a portion of the 
spectrum of HJ£- 

As first highlighted by Haydock 16,17 , an additional ad- 
vantage of Lanczos recursive approach is the possibility 
of calculating the resolvent {uj — H)^ 1 matrix elements, 
bypassing completely the diagonalization. In fact the re- 
solvent for the state \uq) takes the form of a continued 
fraction 



(uo|(w - HY^uo) = \\u \\' 



1 



(u) - ax) - 



(uj - a 2 ) 



\Q j+ i) = Z%) - aj\qj) - bjlqj^t}, 



(3) 



Other matrix elements can be then calculated by recur- 
sion (see IA"1) . 
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B. Lanczos method for non-Hermitian 
eigenproblems 

The Lanczos recursive approach can be extended to 
the non-Hermitian case£. For a non-Hermitian matrix H, 
that we suppose diagonalizable, the action on a ket \v) 
differs from the action on a bra (v\: no orthogonal basis 
set exists, that could transform it into a diagonal form. 
The most straightforward extension of the Lanczos proce- 
dure illustrated in the previous subsection is the Arnoldi 
recursive approach that transforms H into an upper- 
Hessenberg matrix, instead of a tridiagonal one, and thus 
presents clear computational disadvantages with respect 
to the Hermitian caseJ^ 

It is still possible to tridiagonalize H, by defining a bi- 
orthonormal Lanczos basis |<?i)}, that is (Pi\qj) = 

5ij while in general, {|<?i)} are not orthogonal. In 

this basis, H is represented as a non-Hermitian tridiago- 
nal matrix 



/ai b 2 



j<ij — 







C2 02 b 3 

o '•. '■• 











a 7 -_i bj 



"3- 
Ci 



(8) 



The first vectors, (pi|, \q\) of the Lanczos basis are set 
equal to given bi-orthonormal vectors (wq\, \uq). The 
next vectors are calculated from the three-term relations 



\Qj+i) ={H - a j)Wj) - CjWj-i)- 



where 



Oj+l 



&\H\ qj ), 

= \\Q 3 +il 

c j+i = (Pj+i\Qj+i)/bj+i, 
(Pj+ll = {Pj+l\/Cj+l, 

kj+i) = \Qj+i)/ b j+i- 



(9) 
(10) 

(11) 
(12) 

(13) 

(14) 

(15) 

Figure [2] schematically describes the algorithm corre- 
sponding to Eqs. (|9llip . The number of steps is doubled 
with respect to the Hermitian case, since we build two 
sets of basis vectors, so that each operation performed 
for \qi), has to be repeated for the (pi\. Another im- 
portant difference with respect to the Hermitian case is 
that, since {!<&)} are not orthogonal, it can hap- 

pen that one of the new vectors that are built in Eqs. © 
are zero or their vector product is zero. Then at each it- 
eration, one has to check, together with the convergence 
conditions, that the algorithm does not break downJ^ 

Also the Lanczos-Haydock (LH) procedures, Eq. (|7J), 
can be generalized (bf are replaced by Cj&j). Indeed, the 
non-Hermitian LH has been applied to the calculation 
of the dynamical polarizabilities of molecules within the 
time-dependent density functional and Bethe-Salpeter 
perturbation theory schemes introduced in Refs. I20l - t22l 
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FIG. 2: Non-Hermitian Lanczos algorithm [Eqs. (|9ll5j) ]. The 
steps that are added with respect to the Hermitian case in 
Fig. ([2]), are tagged with a primed letter. See text for further 
explanations. 



C. Pseudo-Hermicity 

The Hermicity of an operator H, that is H = H\ 
insures the reality of the spectrum of H. However, Her- 
micity is a sufficient, but not necessary condition for the 
reality of the spectrum of an operator. To study system- 
atically non-Hermitian Hamiltonian with real spectrum, 
quite recently, Mostafazadeh 23 introduced the concept of 
pseudo-Hermicity. The ?7-pseudo-Hermitian adjoint of H 
is defined as 



H 3 



V 



V 



(16) 



where r\ is an invertible transformation in Hilbert space. 
Then, a Hamiltonian is ?7-pseudo-Hermitian if H* = H. 
It is easy to recognize that this definition includes Her- 
micity (r) = I). 

Using this concept, it is possible to define a sufficient as 
well as necessary condition for the reality of the spectrum 
of a matrix. It can be proved^ that H diagonalizable has 
a real spectrum if and only if H is ^-pseudo-Hermitian 
and there exists an operator O with -q = OO^ . Since r\ is 
positive-definite one can define an inner product, 



(•!•), == <#>, 



(17) 



and the corresponding norm || • Then, with respect 
to this inner product, H is Hermitian (alternatively O 
transforms H into an Hermitian Hamiltonian). 
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III. PSEUDO-HERMICITY OF THE RPA 
HAMILTONIAN 

In this section we first show that the RPA Hamiltonian 
is pseudo-Hermitian, and introduce an operative way to 
define the inner product with respect to which H RPA is 
Hermitian. As a consequence, the Lanczos algorithm for 
Hcrmitian operators presented in the previous section is 
simply reformulated using this inner product. Finally the 
algorithm is specialized for TD-DFT and BS calculations 
of excited state properties of electronic systems. 

A. Redefinition of the inner product for RPA 
Hamiltonian 

The RPA Hamiltonian H can be written a o 2 ' 25 

H = FH=§ £). (18) 

In Eq. ([TS]) . F is a Hermitian involution, and H a Her- 
mitian matrix (indeed R is Hermitian, and C is symmet- 
ric). H is pseudo-Hermitian with respect to both F and 
H . In fact, since F is an involution, H = FH, and the 
F-pseudo-Hermicity, FH = H'F follows directly from 
the Hermicity of H. Note that the F-pseudo-Hermicity 
corresponds to the invariance of the Hamiltonian with 
respect to FK, with K is the complex conjugation ma- 
trix, or more physically to the invariance of the Hamilto- 
nian under the combined action of parity (F) and time- 
reversal (K) operator. However, since F is clearly non 
positive definite, it cannot be used for defining a norm. 

More interestingly, the F-pseudo-Hermicity is also eas- 
ily proved 

HH =FHH (H definition) 

=WFH (H F-pseudo-Hermicity) 
=H f H (H definition) □. 

The positive definitiveness of H is intimately related to 
the nature of the independent particle solution |$) from 
which the particle-hole pairs are built. In fact H is the 
curvature tensor, or stability matrix, that one obtains 
when expanding the energy surface around the stationary 
point |4>) for small variations |<5<j>). If |$) corresponds to 
a minimum in the energy surface, then H is positive def- 
inite^ For excitations of an electronic system, in connec- 
tion with the singlet-triplet instability, Zimmermann 25 
found that a sufficient condition for H to be positive def- 
inite is the smallest particle-hole pair energy being larger 
than the modulus of the largest matrix element of the 
coupling matrix C . 

Except in very rare cases, in practical applications the 
|$) corresponds to a minimum in the energy surface, 
hence we can use H to define a modified inner product, 
with respect to which H is Hermitian 

(■\-)h := {■]£■), (19) 



and the associate norm. With this inner product, the 
closure relation for a complete set of functions {^n}, or- 
thonormal with respect to the modified inner product, 
has to be defined accordingly as 

J2mn)(*n\=L (20) 



B. Lanczos recursive method for RPA 
Hamiltonians 

Using the properties in previous section, we can re- 
design the Lanczos recursive procedure (Sec. Ill Ap sim- 
ply by replacing the standard inner product with the one 
defined in Eq. ([T9")) . The result is the algorithm proposed 
in Reflbl generalized to complex matrices. The main 
question is whether the replacement of the inner prod- 
uct is convenient numerically, with respect to the stan- 
dard non-Hermitian Lanczos method. Indeed the new in- 
ner product implies a matrix- vector multiplication by H, 
and thus, as in the non-Hermitian Lanczos, the computa- 
tional cost is doubled. However, taking into account that 
H and H are related by F, effectively only one matrix- 
vector multiplication is needed as in the Lanczos method 
for Hermitian matrices. In fact, 

a, = ( qj \H qj ) R =( qj \H^FH\ qj ) = ((H qj )\F\(H qj )), 

(21) 

bj+i = WQo+xWh =y/(Q j+ i\F\(HQ j+1 )), (22) 

so that only one matrix-vector multiplication by H ap- 
pears, while computing the matrix elements of F will 
be unexpensive, because F corresponds to the identity in 
the subspace of particle-hole propagating forward in time 
and to minus the identity in the subspace of particle-hole 
propagating backward in time. The Lanczos algorithm 
so modified is shown in Fig. [3] With respect to the Her- 
mitian case both in the variable initialization [steps (A) 
to (F)], and in the conditional loop [steps (G) to (O)] 
the algorithm is rearranged: the bi factors [steps (C) and 
(M)] are calculated after the matrix-vector multiplica- 
tion [steps (B) and (L)]; as a consequence two additional 
steps are added to normalize the Lanczos basis vectors 
[steps (D)-(E), (N)-(O)]. The reordering, and addition of 
the extra (computationally inexpensive) steps avoids the 
additional matrix-vector multiplication due to the modi- 
fied inner product. The cost is then, as in the Hcrmitian 
case, of one vector-matrix multiplication. Coming to the 
diagonal matrix elements of the resolvent for the state 
\uq) [Eq.([7J) in the Hermitian case], the straightforward 
generalization (that is substituting the standard with the 
H-inner product) gives 

(u \(w - #) _1 Uo)jj = \\u \\ 2 s - 2 . 

(Ul-ai) 

{u - a 2 ) 

(23) 
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|>) <= \u ) (A) 

\t)^H\s) (B) 

b <= x/HfW (c) 

l«> *= W/fc> (D) 

|t) |i>/6 (E) 

6j 0, |r> <= (F) 
i <= 1 

p until CONVERGENCE do 

Oj <^= (t|F|t) (G) 

WH^-OiM-Mr) (H) 

|r) <= |*) (I) 

|s) 4= |t) (K) 

|t) 4= ff| S > (L) 

h+i y/{"\F\t) (M) 

|s) * |s)/fc i+1 (N) 

|t) <= |*>/&*+i (O) 

i 4= i + 1 



FIG. 3: Pseudo-Hermitian Lanczos algorithm 

However, we are still interested in the matrix element 
calculated with the standard inner product. Using the 
closure relation [Eq. (|20))]. we expand (wo\(u> — H)^ 1 ^) 
in terms of of (qi\(uj — H)~ 1 \uq) h, 

(wqKu-H)- 1 ^) = Y,(wo\qi}(q i \(u-H)- 1 u ) B . (24) 

i 

The first element of the sum oc (uq\(lu — H) Uq) g is 
given by Eq. (|2"5)l , while the other elements are found by 
recursion (see The standard product (woki), can be 
computed at each iteration of the Lanczos process and 
stored. The elements with i ^ 1 are different from zero, 
since the Lanczos basis vectors are orthonormal with re- 
spect to the //-inner product, not with respect the stan- 
dard one. On the other hand for the cases we studied we 
found that in fact the summation in Eq. (l24l) converges 
rapidly. 

C. Special case of calculation of the macroscopic 
dielectric function 

The Lanczos approach introduced in the previous sub- 
section can be applied to calculate the macroscopic di- 
electric function cm(w) within either the BS or TD-DFT 
framework. As shown in Ref. [26| the £m(w) can be rewrit- 
ten as 

e M {io) = l-(P\(u-H)- 1 F\P), (25) 

where \P) is a ket whose components in the \\fJ-) and \fJ-X) 
space are the optical oscillators: (P\X^) oc (A|d-£|^), with 
d the electronic dipole, and £ the light polarization factor. 
The components of the |P) vector in a particle-hole pair 
basis have the following symmetry 

(Ml-P) = «A/i|P)r • (26) 



Eq. (|25|) is straightforwardly calculated from Eq. ([24]) 
choosing |u ) = F\P). The components of |mo) vector in 
a particle-hole pair basis have the following symmetry 

(H\\u ) = - ((\n\u ))* . (27) 

The symmetry property of \uo) [Eq. (|27|) ] can be used 
to further reduce the computational load of the pseudo- 
Hermitian Lanczos algorithm. Within the vector space 
spanned by the particle-hole pairs let's consider the sub- 
space V_ of vectors with the property Eq. (|2"Tj) . and the 
subspace of vectors V+ with the property Eq. (f2"6"|) . The 
two subspace are clearly non-overlapping. It can be ver- 
ified that H transforms a vector v G V+(-) in a- vector 
w belonging to the same subspace, while F transforms a 
vector v G ^+(-) m a vector v' belonging to the other sub- 
space. As a consequence H transforms a vector v G V+(-) 
in a vector w' belonging to the other subspace. Also, we 
note that when qj belongs to either one of the subspace, 
the product in Eq. (f2"Tj) is zero. 

Next, using Eq. ([3]) it can be demonstrated by induc- 
tion that if the first Lanczos basis vector \qi) belongs to 
one of the subspaces, the vectors of the Lanczos basis set 
belongs to either one of the subspaces depending on the 
parity of the iteration j. More specifically if — as it is our 
case — |<7i) G V_, then all \qj) with j odd belong to V_, 
and all qj with j even belong to V+. In fact, if qi G V- 
then 52 G V+ (base case). Furthermore, suppose that in 
Eq. ©, \qj) G V_, and \qj-i) G V+, then Q j+i (thus the 
normalized qj+i) belong to V+ and Qj+2 [obtained by 
applying again Eq. © after updating \q n -i) ,\q n ) ,\q n +i)] 
to V_ (inductive step). 



\s) <= \\fi)(\fi\uo) 


(A) 


\t) <= r\s) - c\*y 


(B) 


b <^ ^2Re{s\t) 


(C) 


I s ) <= \ s )l h a 


(D) 


\t) <= \t)/b 


(E) 


6i <^ 0, |r) <= 


(F) 


i <= 1 






p until CONVERGENCE do 






|t) <^ \t)-bi\r) 


(G) 




\r) <= |«) 


(H) 




|«) <= \t) 


(I) 




\t) ^R\s) + (-l) i+1 C\sy 


(K) 




b t+1 <= ^2Re{s\t) 


(L) 






(M) 




\t) \t)/b i+ i 


(N) 




i •«= i + 1 









FIG. 4: Pseudo-Hermitian Lanczos algorithm modified to ac- 
count for a starting vector \P) with the anti-Hermitian prop- 
erty given by Eq. (|27|l 



Figure |4] shows the modified Lanczos algorithm for the 
case in which the starting vector has the symmetry prop- 
erty given by Eq. (j2"7| . Since half of the components of 
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\uq) contains all the information, the first Lanczos vector 
[step (A)] is just the projection \u Q ) on the |A/i) subspace 
(that is only particle-hole pairs propagating forward in 
time). Then, if H RPA is a N x N matrix, the algorithm 
works with vectors and matrices of dimension N/2 and 
N/2 x N/2 respectively, like in the case of the TDA ap- 
proximation. The multiplication by iJ RPA , that is the 
operation determing the cost of the algorithm, is replaced 
by two matrix- vector multiplications for matrices of half 
the size [steps (B) and (K)], so that the cost is reduced 
by a factor two compared to the full matrix casei^l 

Similarly, the H-norm of a vector defined in Eq. (f2"2"j) is 
rewritten as in steps (C) and (L) of Fig. [4j Furthermore, 
since all Oi are zero by symmetry they do not need to 
be calculated and can be omitted [step (G)] in the three- 
terms relation [Eq. ©]. 

IV. RESULTS 

We applied the algorithms described in Figs. ([1]) 
and ((4]) to the calculation of the optical absorption of the 
isomers of the trichlorobenzene (TCB) molecule within 
the BSE framework. First, we analyze the effect of the 
TDA on the spectra of the isomers, then we assess the 
accuracy and efficiency of the algorithms. 

A. Effect of the particle- hole hole- particle coupling 

Within the BSE framework the R and C matrix ele- 
ments are defined as 

Raj3,X^ — {E/3 — E a )5 a \5p^ + Kafi^Xfj,, (28) 
Cap^x = K al 3^x, (29) 

where E a ,Ep are the (quasi)particle/hole energies. The 
BS kernel (see e.g. Ref. M) 

K a p,\n = v a p^ - W a /3,XiJ,, (30) 

describes the interaction between the particle-hole pairs 
in terms of particle-hole exchange Sq^a^ (v is the Hartree 
potential without the long-range component) and at- 
traction Wo^A/i (W is the screened interaction). The 
particle-hole exchange is responsible of the local-field ef- 
fects, that is the effect of the induced microscopic electric 
fields. The particle-hole attraction introduces instead the 
so-called excitonic effects. 

In our calculations the basis of particle-hole pairs 
{I7A}} has been obtained from the solution of the 
Kohn-Sham equation for the system. The Kohn- 
Sham orbital energies has been corrected to obtain the 
(quasi)particle/hole energies E a , Ep by using the GW 
approximation. The Kohn-Sham calculations have been 
performed using the pseudo-potential plane-wave code 
ABINIT— , while the GW and BSE calculations have been 
performed using the yambo code 2 - where the algorithms 
described in this work have been implemented. 
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FIG. 5: [Color online] Optical spectra of TCB isomers: 1,3,5- 
TCB (top panel), 1,2,3-TCB (middle panel) , 1,2,4-TCB (bot- 
tom panel). The imaginary part of the macroscopic dielec- 
tric function (in arbitrary unit) calculated within BSE using 
both full (black solid line) and TDA (red dashed line) are 
compared with the gas-phase experimental absorption cross 
section spectra (blue dotted line). The energy and relative 
strength of the excitations for full BSE (black squares) and 
TDA-BSE (red circles) obtained by exact diagonalization are 
reported in linear scale (the largest intensity is normalized to 
!)• 
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Figure [5] shows the optical absorption spectra of 1,3,5- 
TCB, 1,2,3-TCB and 1,2,4-TCB within the BSE obtained 
both by LH and exact diagonalization of either the TDA 
or the full Hamiltonian. Results are compared with the 
gas-phase experimental absorption cross section^ The 
experimental spectra for the three isomers are very sim- 
ilar to each other with the main peak centered at about 
6.3 eV, a shoulder at about 5.5 eV and a very weak fea- 
ture at 4.5 eV. For 1,2,4-TCB, the isomer with less sym- 
metry, the peak at 4.5 eV is enhanced, the shoulder is 
broader and more pronounced and an extra peak is vis- 
ible at 7 eV. The full BSE spectrum reproduces fairly 
well the position of the peaks in the three isomers. 1,3,5- 
TCB, 1,2,3-TCB show a dark transition at 4.5 eV, that 
acquires strength for the 1,2,4-TCB. In correspondence 
of the 5.5 eV shoulder in the experimental spectra, all 
the theoretical spectra present excitations. In the 1,3,5- 
TCB these are dark, while for 1,2,3-TCB and 1,2,4-TCB 
a shoulder appears in the spectra. The theoretical spec- 
trum of 1,2,4-TC shows also the additional peak around 
7 eV. The neglection of the coupling within the TDA 
blue-shifts by about 0.5 eV the main peak for all the iso- 
mers, with the effect of worsening the agreement with the 
experimental data, especially for what concern the rela- 
tive position of the peaks. The main peak is due to two 
very close excitations due mostly to the n — > it* tran- 
sitions between the doubly degenerate highest occupied 
(HOMO) and the doubly degenerate lowest unoccupied 
molecular orbital (LUMO) and is delocalized over the 
whole molecule. The analysis of the particle-hole pairs 
needed to describe these excitations show an important 
contribution coming from the particle-hole pair with neg- 
ative energy at minus the HOMO-LUMO gap, that is ne- 
glected within the TDA. These results confirm the trend 
of blue-shifting and overestimating the intensity of ex- 
cited state with a more delocalized character (such as 
7r — > 7r* excitations in molecules) that has been already 
observed and discussed in the literature i 12 ! 13 



B. Accuracy of the pseudo-Hermitian Lanczos 
algorithm 

At each iteration the LH algorithm provides an ap- 
proximation for the eigenvalue spectrum of the RPA 
Hamiltonian and thus, through the macroscopic dielec- 
tric function, of the optical absorption spectrum. Fig- 
ure [5] shows how by increasing the number of iterations, 
the spectrum obtained by the LH approach becomes more 
and more accurate, until the results are, for about 300 it- 
erations, indistinguishable on the scale of the plot. Note 
that the spectrum does not converge uniformly for all en- 
ergies, but it converges before in the low energy part, and 
then progressively for higher excitations. In fact if we re- 
strict ourselves to the part of the spectrum examined in 
the experiment would need only about 75 iterations. 

The accuracy can be improved by properly terminating 
the continued fraction in Eqs. (171 HH)) . For the spectra in 
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FIG. 6: [Color online] Top panels: Absorption spectrum of 
1,3,5-TCB obtained by exact diagonalization (gray shadow) 
or by the LH iterative procedure for a different number of 
iterations. Bottom panels: error in spectrum obtained by the 
LH method with respect to exact diagonalization. Note that 
in the right bottom panel the curve have been magnified by a 
factor 5. The matrix elements in Eqs. (|28|) of the Hamiltonian 
have been calculated with Wj,^^ = in Eq. (|30[) and without 
quasiparticle corrections. 



Fig. |5]we just truncated the continued fraction. Instead, 
as suggested in Ref. |2l], in Fig. [7J the asymptotic behavior 
of the continued fraction is described by the terminator 



sH = 



+ b 2 u -b 2 g + .^(UJ 2 +b 2 u - 62)2 _ 4w 2 & 2 
27^2 1 



(31) 



that is obtained by resumming the continued fraction 
corresponding to a tight-binding Hamiltonian with the 
hopping parameters oscillating between two values, b u 
and b g £2- For b n and b g we use the averages of the odd 
and even bi in Eq. @ respectively. In fact we have veri- 
fied that in Eq. ([2]) the odd (even) bi oscillate around its 
asymptotic value b u (b g )£2- 

The terminator improves the quality of the spectrum 
obtained by iteration especially for higher energies (as 
it should be since we are correcting the asymptotic be- 
havior) where it partially eliminates the spurious oscil- 
lations due to the truncation of the continued fraction. 
Using the terminator we then reduce to 200 the number 
of iterations needed for satisfactorly reproducing the ex- 
act diagonalization results in the whole energy range we 
considered. 



C. Efficiency of the pseudo-Hermitian Lanczos 
algorithm 

The computational time is determined by two fac- 
tors: the dimension of the Hamiltonian and the number 
of iterations needed to reach a desired accuracy. First, 
we analyze the behavior of the algorithm with respect 
to the dimension of the Hamiltonian. Figure [5] compares 
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FIG. 7: As Fig. [6] but the terminator in Eq. ((3Tj) has been 
used for the continued fraction in Eqs. (171 124[). 



the timing for exact and iterative diagonalization of the 
BSE using the LH approach, either including (pseudo- 
Hermitian H) or neglecting (Hermitian H) the coupling. 
The number of particle-hole pairs has been gradually in- 
creased so to enlarge the dimension of the Hamiltonian. 
Note that for the TDA Hamiltonian the dimension is 
equal to the particle-hole pairs, while for the full Hamil- 
tonian the dimension is equal to twice the particle-hole 
pairs. For the LH approach the number of iteration is 
kept fixed. 

When the number of particle-hole pairs exceeds 1000, 
the iterative methods become faster than exact diagonal- 
ization: for about 7000 particle-hole pairs the LH itera- 
tive procedure is about two order of magnitude faster for 
the Hermitian case, and three order of magnitude faster 
for the pseudo-Hermitian case. In fact, the computa- 
tional time of diagonalization is increasing rapidly with 
the dimension of the matrix, it grows about two order of 
magnitude while the Hamiltonian grows by about one or- 
der of magnitude. On the other hand the computational 
time of the LH approach is increasing slowly with the 
dimension of the matrix, while the Hamiltonian dimen- 
sion grows of one order of magnitude, the computation 
time remains of the same order of magnitude. Note that 
the non-Hermicity has large impact on the timing of the 
diagonalization, while-thanks to the algorithm presented 
in Fig. HJ-it does not influence the performance of the it- 
erative method considering that the full Hamiltonian has 
twice the dimensions if the TDA one. 

Second, we turn to the number of iterations needed 
to fulfill a given convergence criterium. Figure |H] com- 
pares, fixed the dimension of the Hamiltonian, the error 
on the spectrum with the number of iteration in the Her- 
mitian and pseudo-Hermitian case. We measured the er- 
ror at each iteration j as the frequency- weighted sum (on 
all the frequencies within the considered energy range) 
of the differences with respect to the previous iteration 
J2i \ s i(j) - - l)\/Si(j)/u)i, where Si is the value of 
the spectrum at u>i^ It appears that the LH approach 
applied to the Hermitian TDA Hamiltonian converges to 
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FIG. 8: Time needed to diagonalize the Hamiltonian against 
the number of particle-hole pairs, either by exact diagonaliza- 
tion (using the LAPACK from the Intel Mathematical Kernel 
Library) or by the LH method (200 iterations, 1500 frequen- 
cies in the range between 0-15 eV). For the TDA Hamiltonian, 
the LH in Fig. (TJJ) has been used. For the full Hamiltonian 
both the pseudo-Hermitian LH in Fig. @ and the optimized 
pseudo-Hermitian LH in Fig. Q have been used. The CPU 
time (one core on an Intel/Xeon) has been measured using 
standard C timing routines. 
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FIG. 9: Fixed the Hamiltonian dimension (N=4824), conver- 
gence behavior with the number of iterations of the Hermitian 
and pseudo-Hermitian LH. 

a given threshold about twice as fast than the LH ap- 
proach for the full Hamiltonian. In order to explore the 
causes of the slower convergence for the full Hamiltonian, 
we have considered the Hermitian Hamiltonian 

H no coupl - ( q _° R ?j (32) 

obtained by setting the coupling elements to zero in 
Eq. ([1} , and we have diagonalized it using both the Her- 
mitian and the pseudo-Hermitian LH algorithms. In both 
cases we found the convergence rate is the same as for the 
full Hamiltonian with the coupling. This indicates that 
neither the coupling/non-Hermicity, neither the pseudo- 
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Hermitian algorithm are responsible for the slower con- 
vergence rate. We conjecture that the difference is due 
to different spectrum of the TDA Hamiltonian and the 

H no coupl j n Ref _ [gj] Chen and Q uo have shown for 

the simple Lanczos that fixed the number of iterations 
k, the number of converged eigenvalues m in the Lanc- 
zos method was inversely proportional to the spectral 
range r of the Hamiltonian (defined as r = E max — E m i n , 
where E max / m - m is the largest /smallest eigenvalue of H). 
Considering our particular case, the energy spectrum of 
R in H n ° c^p 1 goes from (approximative estimate from 
independent particle energies) 4.4 eV to 27 eV, instead 

the energy spectrum of R' (of the same dimension N as 
H no coupl) goeg from 4 4 eV tQ 28 eV 35 go p ract i ca ii y the 

energy range of R and R' is very similar, and the en- 
ergy range of H DO coupl goes from -27 eV to 27 eV, that 
is about twice as large than that of R' . This may ex- 
plain the slower convergence rate of the LH method for 

j^no coupl 
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Appendix A: Off-diagonal resolvent matrix elements 
in the Lanczos basis 



V. SUMMARY 



In this work we have explained the theoretical ground 
and detailed the derivation of the pseudo-Hermitian 
Lanczos recursion method for RPA Hamiltonian intro- 
duced in Ref. |l2|. We also discussed, using numerical 
examples, the accuracy and the computational cost of 
the method when compared with conventional diagonal- 
ization techniques and the Hermitian Lanczos recursion 
method. As expected the cost per iteration of the Hermi- 
tian and pseudo-Hermitian (at parity of matrix dimen- 
sion) is the same. On the other hand the number of iter- 
ations needed to reach a desired accuracy in the Lanczos 
eigenvalues spectrum is larger (about the double) of the 
Hermitian case. We have tested that this behavior does 
not depend on either the non-Hermicity of the Hamilto- 
nian or the pseudo-Hermitian algorithm: we argue that 
it is caused by the larger spectral range. In fact the RPA 
spectra includes both negative and positive particle-hole 
pairs, thus it is about the double of the TDA spectra. We 
expect the pseudo-Hermitian LH algorithm for the RPA 
to have particular relevance within computational con- 
densed matter physics and theoretical chemistry, where 
the calculation of the optical response of materials require 
the solution of large eigenproblems for RPA Hamiltonian. 
In f act, as shown briefly in this work, and discussed in 
Ref. [Iolll2Ul3l the treatment of the full RPA Hamiltonian 
is important for accurately reproducing optical spectra of 
molecules, nanostructures and the energy loss function in 
solids. Moreover by using the pseudo-Hermicity property 
it is possible in principle to conveniently reformulate any 
Hermitian algorithm for RPA eigenproblems. 



The off-diagonal resolvent matrix elements in the Lanc- 
zos basis at iteration j, 

Gi fi = { qi \{u-H)- l \u Q ), (Al) 

can be obtained by recursion considering the system of 
linear equation 

(wI-T')G* „=*«,, (A2) 
with T J defined in Eq. © from which it follows 

Gio = {l-(u>- ai )Gi fi )/b 2 , (A3) 
<o = (~ W-2,o - (w - a n )Gi_ h0 )/b n+1 . (A4) 
Alternatively, by direct solution of Eq. (|A2[) one obtains 

G J „,o = (-l) n nr =1 6 J+ i^|^. (A5) 
I A) I 

where = col — T J , and A\ is the minor of obtained 
by deleting the first i rows and columns. This expression 
can be rewritten to obtain a recursion formula as, 

G n,o = -bn+i ' TJ^ <?n-l,0- ( A6 ) 

I Ai| 

Equations (|A6I) and (IA4|) are equivalent [using G J 00 = 
\A{\/\Al\ and the relation \A?\ = (u - Oj+i)|^ +1 | - 
bf +2 \Aj +2 \ in Eq. ifAlj) . one obtains Eq. ([A"6"])]. but they 
may give different results due to the propagation of nu- 
merical error. In fact, since Eq. (|A4[) depends on u, the 
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error i5 on G{ propagates and affects G J n by about 
(uj) n+1 S. Eq. (|A6[) instead, the error on G 3 nQ is pro- 
portional to (|j4„+i \/\Ai \)6. From numerical test we 
have seen that Eq. (|A4|) gives indeed numerical problems 



for large n, thus Eq. (|A6[) has been used to calculate 
the off-diagonal resolvent matrix elements in Eq. (|23p . 
Since |^ +1 |/|^| are computed already to obtain G 3 , 
Eq. (|A6|) does not introduce extra computational cost. 
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